Polycomb genes from Maize-Mez1 and Mez2

ABSTRACT

The present invention relates to polycomb genes and polypeptides isolated from Zea mays.

RELATED APPLICATION INFORMATION

This application claims priority from U.S. Ser. No. 60/218,745 filed on Jul. 17, 2000.

TECHNICAL FIELD OF THE INVENTION

The present invention relates to plant genetic engineering. More specifically, the present invention relates to polycomb nucleic acids cloned from Zea mays L.

BACKGROUND OF THE INVENTION

In eukaryotes, gene expression patterns are regulated in response to developmental and environmental cues. These changes in gene expression patterns are often the result of specific transcriptional regulators. In many cases, this change in gene expression must be stably maintained through many mitotic cell divisions even though the transcriptional regulator that effected the change in expression is only present transiently. The stable maintenance of a transcription state is performed by a set of nonspecific factors. These factors are important in regulating chromatin states and establishing a chromatin “memory” to effectively maintain the proper gene expression patterns. In Drosophila, the Polycomb group (PcG) genes are involved in nonspecific, long-term stabilization of transcriptional repression. Recently, homologs of some of the polycomb group genes have been shown to affect developmental gene regulation in other species.

There are at least thirteen PcG proteins in Drosophila. Mutations in any of the thirteen identified PcG genes can lead to lethality during early development (See, Simon, J., Current Opinion in Cell Biology, 7(3):376-85 (1995); Pirrotta, V., Curr. Opin. Gen. Dev., 7(2):249-58 (1997); Pirrotta, V., Cell, 93(3):333-6 (1998)). The cause of this lethality is the failure to maintain transcriptional repression of homeotic genes of the Antennapedia/bithorax complex. The expression pattern of these homeotic genes is controlled in the embryo by activators and repressors that define body segments. During gastrulation, these specific factors are no longer present and PcG protein complexes stabilize a silenced state at genes repressed by the specific factors. Importantly, PcG complexes silence different targets in different cell lineages. This indicates that PcG complexes are able to silence based on factors such as transcription state and not just on sequence. An antagonistic set of factors which maintain the active transcriptional state, the trithorax group, also exists in Drosophila.

In addition to playing a role in developmentally regulated repression of gene expression, the PcG proteins are also involved in maintaining a silenced state at other loci. When high copy numbers (>3) of a white-Adh transgene are introduced into the Drosophila genome, the level of white-Adh expression becomes reduced via cosuppression (Pal-Bhadra et al., Cell, 90:479-490 (1997)). In addition to reductions in the expression of the transgenes, the expression of the endogenous Adh gene is reduced as well. This cosuppression is relieved by mutations in polycomb (Pc) or polycomblike (pcl). The cosuppression is based on a homology sensing mechanism that leads to repression via PcG proteins (Pal-Bhadra et al., Cell, 99:35-46 (1999)). The PcG protein, enhancer of zeste, E(z), is required for trans-silencing of P-elements (Roche et al., Genetics, 149(4):1839-55 (1998)). Increased expression of E(z) or the human homolog (EZH2) results in enhancing position effect variegation (PEV) of a heterochromatin associated white locus (Laible et al., EMBO J., 16(11):3219-32 (1997)). The EZH2 gene was also able to restore telomere mediated gene repression in S. cerevisiae (Laible et al., EMBO J., 16(11):3219-32 (1997)). These studies suggest that the PcG proteins can play a role in epigenetic inactivation of gene expression distinct from the role of developmental regulation.

Many of the domains present in the PcG proteins that have been cloned are implicated in protein-protein interactions. The esc and E(z) proteins have been shown to interact with each other in a yeast two-hybrid system and through in vitro binding assays (Jones et al., Mol. Cell Biol., 18(5):2825-34 (1998)). Homotypic and heterotypic interactions based on the SPM domain have been documented for Sex combs on midleg (Scm) and ph (Bornemann et al., Development, 122(5):1621-30 (1996); Peterson et al., Mol. Cell Biol., 17(11):6683-92 (1997)). The Xenopus Pc homolog, Xpc, forms complexes with itself and Bmi-1 (a psc homolog) (Reijnen et al., Mech. Dev., 53(1):35-46 (1995)). In other yeast two-hybrid screens, ph interacts with itself and with Psc, and Psc interacts with Pc (Pirrotta, V., Curr. Opin. Gen. Dev., 7(2):249-58 (1997)). These results indicate the presence of a large complex of PcG proteins that is held together by multiple protein-protein interactions among various PcG members.

Recent evidence suggests that PcG proteins actually form two distinct complexes. One complex contains E(z) and esc, which have been found to directly interact (van Lohuizen et al., Mol. Cell Biol., 18(6):3572-9 (1998); Jones et al., Mol. Cell Biol., 18(5):2825-34 (1998); Sewalt et al., Mol. Cell Biol., 18(6):3586-95 (1998); Ng et al., Mol. Cell Biol., 20(9):3069-78 (2000)). The second complex is the PRC1 complex (which includes Pc/Ph/Scm/Psc).

Homologs of PcG proteins have been characterized in a number of species. Vertebrates appear to contain the most homologs of PcG proteins (Simon, Current Opinion in Cell Biology, 7(3):376-85 (1995)). Homologs of psc, Pc, ph, E(z) and esc have been cloned in mammals. The role of PcG proteins in mammals is believed to be very similar to the role in Drosophila.

While many of the domains present in PcG proteins are found in yeast proteins, no PcG homologs are present in the S. cerevisiae genome. In C. elegans and Arabidopsis, homologs of two PcG proteins, E(z) and esc, are found. A SET domain and a cys-rich region are found in E(z) (Carrington et al., Development, 122(12):4073-83 (1996); Jones et al., Genetics, 126(1):185-99 (1990); Jones, RS, et al., Mol. Cell. Biol., 13(10):6357-66 (1993)). The esc proteins contain a series of seven WD-40 repeats (Gutjahr et al., EMBO J., 14(17):4296-306 (1995); Simon et al., Mech. Devt., 53(2):197-208 (1995)).

The E(z) and esc homologs (maternal effect sterile-2 (mes-2) and maternal effect sterile-6 (mes-6)) from C. elegans were identified in a screen for maternal-effect mutations that result in sterile offspring (Holdeman et al., Development, 125(13):2457-67 (1998); Korf et al., Development, 125(13):2469-78 (1998)). The mes-2 and mes-6 genes are implicated as maternal genes required for germline immortality. Both mes-2 and mes-6 are localized to the nucleus of all embryonic cells and the nuclei of germline cells in larvae and adults. The localization of each protein depends on the other and on another protein, mes-3 (Holdeman et al., Development, 125(13):2457-67 (1998); Korf et al., Development, 125(13):2469-78 (1998)). Transgene arrays in the C. elegans genome are frequently silenced in germline cells (Kelly et al., Development, 125(13):2451-6 (1998)). Mutations in mes-2 and mes-6 completely alleviate silencing of transgenes in the germline cells (Kelly et al., Development, 125(13):2451-6 (1998)). These results suggest that the PcG proteins of C. elegans, mes-2 and mes-6, are involved in transcriptional repression specifically in the germline cells. It is likely that mes-2 and mes-6 repress transcription of genes that would lead to a differentiated state.

Arabidopsis also contains homologs of E(z) and esc (Goodrich et al., Nature, 386(6620):44-51 (1997); Grossniklaus et al., Science, 280(5362):446-50 (1998); Ohad et al., Plant Cell, 11(3):407-16 (1999)). Arabidopsis contains three E(z)-like genes, curly leaf (clf), Medea (Mea) and E(z)-likeA1 (EZA1), and one esc homolog, fertilization-independent endosperm (FIE1).

Clf mutants display curled leaves, altered maturation times and partial homeotic transformations of floral tissues (Goodrich et al., Nature, 386(6620):44-51 (1997)). Ectopic expression is also observed for the homeotic genes Agamous (AG) and Apetala3 (AP3). These genes are specifically expressed in floral tissues where clf mRNA is also present. This indicates that, similar to the Drosophila PcG proteins, the presence of CLF protein is not sufficient to repress AG and AP3 transcription but requires targeting factors (Goodrich et al., Nature, 386(6620):44-51 (1997)). The homeotic genes AG and AP3 are also ectopically expressed in Arabidopsis plants with reduced methylation levels (Finnegan et al., Proc. Natl. Acad. Sci. USA, 93(16):8449-8454 (1996)).

Medea was identified in a screen for Arabidopsis gametophyte lethal mutations (Grossniklaus et al., Science, 280(5362):446-50 (1998); Chaudhury et al., Proc. Natl. Acad. Sci. USA, 94(8):4223-8 (1997); Luo et al., Proc. Natl. Acad. Sci. USA, 96(1):296-301 (1999)). A plant heterozygous for mea mutations will produce 50% aborted seeds that collapse and do not germinate. Subsequently it has been found that MEA exhibits an imprinted pattern of gene expression (Kinoshita et al., Plant Cell, 11(10):1945-52 (1999); Vielle-Calzada et al., Genes Dev., 13(22):2971-82 (1999)). The maternal copy of Medea is expressed while the paternal copy is not. Medea mutants will allow endosperm development to occur in the absence of fertilization (Kiyosue et al., Proc. Natl. Acad. Sci. USA, 96(7):4186-91 (1999)). These results indicate that maternal expression of Medea is required to repress endosperm development. Due to the early lethality of Medea mutants, roles for Medea later in plant development have not been determined. A third E(z)-like gene, EZA1, is present in the Arabidopsis genome (Preuss, D., Plant Cell, 11(5):765-8 (1999)). Presently, the function of EZA1 is unknown.

Mutations in the Arabidopsis esc-like gene, FIE, have phenotypes similar to Medea. A female gametophyte with a FIE mutant allele will undergo replication of the central cell nucleus and endosperm development without a fertilization event (Ohad et al., Plant Cell, 11(3):407-16 (1999)). This indicates that FIE is critical in the repression of endosperm development. As with Medea, due to the early lethality of FIE mutants, the role of FIE in later developmental events has not been determined. The similar phenotypes of FIE and mea mutants suggest that these two genes may interact functionally like E(z) and esc homologs in other organisms.

SUMMARY OF THE INVENTION

In one embodiment, the present invention relates to an isolated and purified nucleic acid comprising a polynucleotide selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3 and conservatively modified and polymorphic variants thereof. In addition, the present invention relates to an isolated and purified nucleic acid comprising a polynucleotide having at least 60%, 70%, 80%, 90%, or 95% identity to a polynucleotide selected from the group consisting of SEQ ID NO:1 and SEQ ID NO:3.

In yet another embodiment, the present invention relates to an isolated and purified polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4 and conservatively modified variants thereof. In addition, the present invention relates to an isolated and purified polypeptide comprising an amino acid sequence having at least 60%, 70%, 80% or 95% identity to an amino acid sequence selected from the group consisting of SEQ ID NO:2 and SEQ ID NO:4.

In yet a further embodiment, the present invention relates to an expression cassette containing a promoter sequence operably linked to an isolated and purified nucleic acid comprising a polynucleotide selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3 and conservatively modified and polymorphic variants thereof. Preferably, the expression cassette also contains a polyadenylation signal which is operably linked to the previously described nucleic acid. Examples of promoters which can be used in the expression cassette include constitutive and tissue-specific promoters.

In yet another embodiment, the present invention relates to a bacterial cell containing the hereinbefore described expression cassette. The bacterial cell can be an Agrobacterium tumefaciens cell or an Agrobacterium rhizogenes cell.

In still yet another embodiment, the present invention relates to a plant cell transformed with the hereinbefore described expression cassette, a transformed plant containing such a plant cell, and to seed obtained from such a transformed plant. The plant cell, transformed plant and seed can be from Zea mays L.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 shows the Mez1 polynucleotide and amino acid sequences. FIG. 1A shows that the polynucleotide sequence of the Mez1 cDNA is 3180 base pairs (bp). The putative start codon is indicated with a solid underline and the first in-frame stop codon is indicated with a wavy underline. FIG. 1B shows the 931 amino acid Mez1 protein.

FIG. 2 shows the Mez2 polynucleotide and amino acid sequences. FIG. 2A shows that the polynucleotide sequence of the Mez2 cDNA is 3030 bp. The putative start codon is indicated by a solid underline while the stop codon is indicated by a wavy underline. The locations of several introns are indicated by open arrowheads above the sequence. These introns were identified by sequencing of PCR products amplified from genomic DNA corresponding to bp 2032 to bp 2587 of the cDNA. The locations of the four Mu insertions are indicated by black arrowheads below the sequence. The Mez2-Mu1 allele contains a Mu element inserted into intron 1. The Mez2-Mu2, Mez2-Mu3 and Mez2-Mu4 Mu insertions are all located in exons. The nucleotides that flank the sequence that is removed by alternative splicing are indicated by a double underline. FIG. 2B shows the 893 amino acid Mez2 protein.

FIG. 3 shows the alignment of Mez1 and Mez2. The Mez1 and Mez2 protein sequences were aligned using ClustalW (http://dot.imgen.bcm.tmc.edu:9331/multi-align/Options/clustalw.html). These alignments were then processed using Boxshade to highlight identical residues in black and similar residues in gray. The two proteins are 42% identical and 56% similar over their entire length.

FIG. 4 shows the alignment of E(z) sequences. The sequences of Drosophila E(z) (AAC46462), human EZH1 (AAC50778), human EZH2 (AAC51520), C. elegans MES-2 (AAC27124), Arabidopsis CLF (AAC23781), Arabidopsis MEA (AAC39446), Arabidopsis EZA1 (AAD09108), Mez1 and Mez2 were aligned using ClustalW (http://dot.imgen.bcm.tmc.edu:9331/multi-align/Options/clustalw.html). The alignments were colored using Boxshade to highlight identical residues in black and conserved residues in gray. The location of a putative bipartite nuclear localization signal in the plant sequences is indicated by *'s above the alignments. # symbols are located above the cysteine-rich region. The N-terminal SET domain is indicated by + symbols above the alignment. A putative SANT DNA binding domain is shown with ^ symbols. $ symbols are placed above all acidic amino acid residues in an acidic region near the C-terminus. A region containing a CRRC sequence that is highly conserved only in the plant sequences is shown with x's above the alignment. The region between the CRRC domain and the nuclear localization signal is very divergent.

FIG. 5 shows schematic diagrams of E(z)-like proteins. E(z)-like proteins from plants and the Drosophila E(z) are represented by rectangles with the N-terminus located on the left for each protein. The locations of the EZD1, EZD2, SANT, Cys-rich, and SET domains are indicated by shading.

FIG. 6 shows the alignment of the SET domains from Drosophila E(z) (AAC46462), human EZH1 (AAC50778), human EZH2 (AAC51520), C. elegans mes-2 (AAC27124), Arabidopsis clf (AAC23781), Arabidopsis Mea (AAC39446), Arabidopsis EZA1 (AAD09108), Mez1 and Mez2 using ClustalW (region indicated by [] in FIG. 4). The Arabidopsis sequences are underlined. The maize sequences are in bold text. Bootstrap values are indicated by the numbers at nodes in the tree. Only nodes with bootstrap values greater than 50% are shown.

FIG. 7 shows that the Mez2 transcript is alternatively spliced in different tissues. Three predominant transcripts are found: the full length transcript and two smaller transcripts. The two smaller transcripts were isolated and sequenced to reveal the difference between the transcripts. The MEZ2^(a.s.1) transcript is lacking base pairs 1016 to 1676 and translation of this sequence results in a truncated protein of 341 amino acids lacking the conserved C-terminal domains. The MEZ2^(a.s.2) transcript is lacking base pairs 1016 to 1827 and translation of this sequence results in a 624 amino acid protein that lacks the large variable region from the middle of the MEZ2 protein. The MEZ2^(a.s.2) transcript has been found as the predominant transcript in embryo and endosperm tissues.

FIG. 8 shows the results of an RT-PCR analysis of the Mez1 and Mez2 expression patterns. In FIG. 8A, the primer pair Mez1F1-Mez1R1 was used to amplify 2 ng of cDNA from various maize tissues. The PCR products were then separated on a 1% agarose gel stained with ethidium bromide. The arrow indicates the expected size of the PCR product. In FIG. 8B, the primer pair Mez2F4-Mez2R8 was used to amplify 2 ng of cDNA from various maize tissues. The arrows indicate the expected sizes of the Mez2, Mez2^(as1) and Mez2^(as2) isoforms. In FIG. 8C, ubiquitin primers were used to amplify 0.2 ng of cDNA from the same maize tissues as a control. The pollen cDNA did not allow the amplification of significant amounts of product, indicating that the results using this cDNA are questionable.

DEFINITIONS

Units, prefixes, and symbols can be denoted in the SI accepted form. Numeric ranges are inclusive of the numbers defining the range. Unless otherwise indicated, nucleic acids are written left to right in 5′ to 3′ orientation. The headings provided herein are not limitations of the various aspects or embodiments of the invention which can be had by reference to the specification as a whole. Accordingly, the terms defined immediately below are more fully defined by reference to the specification as a whole.

As used herein, the terms “amplify” or “amplified”, used interchangeably, refer to the construction of multiple copies of a nucleic acid sequence or multiple copies complementary to the nucleic acid sequence using at least one of the nucleic acid sequences as a template. Amplification methods include the polymerase chain reaction (hereinafter “PCR”; described in U.S. Pat. Nos. 4,683,195 and 4,683,202), the ligase chain reaction (hereinafter “LCR”; described in EP-A-320,308 and EP-A-439,182), the transcription-based amplification system (hereinafter “TAS”), nucleic acid sequence based amplification (hereinafter “NASBA”, Cangene, Mississauga, Ontario; described in Proc. Natl. Acad. Sci. USA, 87:1874-1878 (1990); Nature, 350 (No. 6313):91-92 (1991)), Q-Beta Replicase systems, and strand displacement amplification (hereinafter “SDA”). The product of amplification is referred to as an amplicon.

As used herein, the term “antibody” includes reference to an immunoglobulin molecule obtained by in vitro or in vivo generation of a humoral response, and includes both polyclonal and monoclonal antibodies. The term also includes genetically engineered forms such as chimeric antibodies (e.g., humanized murine antibodies), heteroconjugate antibodies (e.g., bispecific antibodies), and recombinant single chain Fc fragments (hereinafter “scFc”). The term “antibody” also includes antigen binding forms of antibodies (e.g., Fab′, F(ab′)₂, Fab, Fc, and inverted IgG (See, Pierce Catalog and Handbook, (1994-1995) Pierce Chemical Co., Rockford, Ill.)). An antibody immunologically reactive with a particular antigen can be generated in vivo or by recombinant methods such as by the selection of libraries of recombinant antibodies in phage or similar vectors (See, e.g., Huse et al., Science, 246:1275-1281 (1989); Ward et al., Nature, 341:544-546 (1989); and Vaughan et al., Nature Biotechnology, 14:309-314 (1996)).

As used herein, the term “antisense RNA” means an RNA sequence which is complementary to a sequence of bases in the mRNA in question in the sense that each base (or the majority of bases) in the antisense sequence (read in the 3′ to 5′ sense) is capable of pairing with the corresponding base (G with C, A with U) in the mRNA sequence read in the 5′ to 3′ sense.
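The base-pairing rule in this definition can be illustrated with a minimal Python sketch. The function name `antisense_rna` and the example sequence are hypothetical and chosen only for illustration; the sketch assumes a fully complementary antisense sequence written with the four standard RNA bases.

```python
# Watson-Crick pairing for RNA: G pairs with C, A pairs with U.
PAIRING = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(mrna: str) -> str:
    """Return the fully complementary antisense RNA, written 5' to 3'.

    The mRNA is given 5' to 3'. Complementing each base gives the
    antisense strand read 3' to 5'; reversing it yields the conventional
    5' to 3' notation.
    """
    mrna = mrna.upper()
    complement_3_to_5 = [PAIRING[base] for base in mrna]
    return "".join(reversed(complement_3_to_5))


# Hypothetical six-base mRNA fragment, written 5' to 3'.
example = antisense_rna("AUGGCC")
```

Reading the returned string from its 3′ end back to its 5′ end pairs each base with the corresponding mRNA base read 5′ to 3′, as the definition requires.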

As used herein, the term “conservatively modified variants” applies to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, conservatively modified variants refers to those nucleic acids which encode identical or conservatively modified variants of the amino acid sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given protein. For example, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are “silent variations” and represent one species of conservatively modified variation. Every nucleic acid sequence herein which encodes a polypeptide also describes every possible “silent variation” of the nucleic acid. It is known by persons skilled in the art that each codon in a nucleic acid (except AUG, which is the only codon for the amino acid methionine, and UGG, which is the only codon for the amino acid tryptophan) can be modified to yield a functionally identical molecule. Therefore, each silent variation of a nucleic acid which encodes a polypeptide of the present invention is implicit in each described polypeptide sequence.
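As an illustration of silent variation, the following Python sketch enumerates the synonymous codons for a given codon under the standard genetic code. The function names `synonymous_codons` and `count_silent_variants` are hypothetical, and the sketch assumes the standard nuclear code only; it does not model organism-specific codon usage.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G at each
# of the three positions; '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(codon: str) -> list:
    """Return every codon that encodes the same amino acid as `codon`."""
    target = CODON_TABLE[codon.upper()]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == target)


def count_silent_variants(coding_rna: str) -> int:
    """Count coding sequences that encode the same polypeptide.

    The count includes the input sequence itself and assumes the input
    length is a multiple of three.
    """
    total = 1
    for i in range(0, len(coding_rna), 3):
        total *= len(synonymous_codons(coding_rna[i:i + 3]))
    return total
```

For the alanine codon GCA the sketch returns the four codons named above, while AUG and UGG each return only themselves, which is why those two codons cannot be silently altered.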

With respect to amino acid sequences, persons skilled in the art will recognize that individual substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which alter, add or delete a single amino acid or a small percentage of amino acids in the encoded sequence are “conservatively modified variants” where the alteration results in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing functionally similar amino acids are well known in the art.

The following six groups each contain amino acids that are conservative substitutions for one another:

-   1) Alanine (A), Serine (S), Threonine (T);
-   2) Aspartic acid (D), Glutamic acid (E);
-   3) Asparagine (N), Glutamine (Q);
-   4) Arginine (R), Lysine (K);
-   5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); and
-   6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W).

See also, Creighton (1984) Proteins, W. H. Freeman and Company.
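The six groups listed above can be encoded as a simple lookup. The following Python sketch is illustrative only; the names `CONSERVATIVE_GROUPS` and `is_conservative_substitution` are hypothetical, and residues outside the six groups (for example glycine or proline) are treated here as having no conservative substitute, which is an assumption of the sketch rather than a statement of the specification.

```python
# The six conservative substitution groups, in one-letter amino acid code.
CONSERVATIVE_GROUPS = [
    set("AST"),   # 1) Alanine, Serine, Threonine
    set("DE"),    # 2) Aspartic acid, Glutamic acid
    set("NQ"),    # 3) Asparagine, Glutamine
    set("RK"),    # 4) Arginine, Lysine
    set("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
]


def is_conservative_substitution(original: str, replacement: str) -> bool:
    """Return True if two different residues fall in the same group."""
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS
    )
```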

As used herein, the term “constitutive promoter” refers to a promoter which is active under most environmental conditions.

As used herein, the term “full length” when used in connection with a specified polynucleotide or encoded protein refers to having the entire amino acid sequence of a native (i.e. non-synthetic), endogenous, catalytically active form of the specified protein. Methods for determining whether a sequence is full length are well known in the art. Examples of such methods which can be used include Northern or Western blots, primer extension, etc. Additionally, comparison to known full-length homologous sequences can also be used to identify full length sequences of the present invention.

As used herein, the term “heterologous” when used to describe nucleic acids or polypeptides refers to nucleic acids or polypeptides that originate from a foreign species, or, if from the same species, are substantially modified from their original form. For example, a promoter operably linked to a heterologous structural gene is from a species different from that from which the structural gene was derived, or, if from the same species, is different from any naturally occurring allelic variants.

The term “immunologically reactive conditions” as used herein includes reference to conditions which allow an antibody, generated to a particular epitope of an antigen, to bind to that epitope to a detectably greater degree than the antibody binds to substantially all other epitopes, generally at least two times above background binding, preferably at least five times above background. Immunologically reactive conditions are dependent upon the format of the antibody binding reaction and typically are those utilized in immunoassay protocols.

As used herein, the term “inducible promoter” refers to a promoter which is under environmental control. Examples of environmental conditions that may affect transcription by inducible promoters include anaerobic conditions or the presence of light.

As used herein, the term “isolated” includes reference to material which is substantially or essentially free from components which normally accompany or interact with it as found in its naturally occurring environment. The isolated material optionally comprises material not found with the material in its natural environment. However, if the material is in its natural environment, the material has been synthetically (e.g. non-naturally) altered by deliberate human intervention to a composition and/or placed in a locus in a cell (e.g., genome or subcellular organelle) not native to a material found in that environment.

Two polynucleotides or polypeptides are said to be “identical” if the sequence of nucleotides or amino acid residues, respectively, in the two sequences is the same when aligned (either manually for visual inspection or via the use of a computer algorithm or program) for maximum correspondence as described below. The terms “identical” or “percent identity” when used in the context of two or more polynucleotide or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence over a comparison window, as measured using one of the following sequence comparison algorithms or by manual alignment and visual inspection. With respect to polypeptides or proteins having a “percent identity” or “percentage of sequence identity”, one skilled in the art would recognize that residue positions that are not identical often differ by conservative amino acid substitutions, where amino acid residues are substituted for other amino acid residues possessing similar chemical and/or physical properties such as charge or hydrophobicity and therefore do not change the functional properties of the molecule. Where sequences differ in conservative substitutions, the percent sequence identity may be adjusted upwards to correct for the conservative nature of the substitution. Means for making this adjustment are well-known to persons skilled in the art. Typically this involves scoring a conservative substitution as a partial rather than a full mismatch, thereby increasing the percentage sequence identity.
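The upward adjustment for conservative substitutions can be sketched in Python as follows. This is a minimal illustration, not the scoring used by any particular program: the function name `percent_identity`, the `partial_credit` parameter and its example value of 0.5, and the choice to count every non-empty alignment column (including gap columns) in the denominator are all assumptions made for the example. The two inputs are assumed to be already aligned, with `-` marking gaps.

```python
# Conservative substitution groups in one-letter amino acid code.
GROUPS = ("AST", "DE", "NQ", "RK", "ILMV", "FYW")


def _same_group(x: str, y: str) -> bool:
    """True if residues x and y belong to the same conservative group."""
    return any(x in group and y in group for group in GROUPS)


def percent_identity(aligned_a: str, aligned_b: str,
                     partial_credit: float = 0.0) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    With partial_credit=0 only identical residues count. A value between
    0 and 1 scores a conservative substitution as a partial rather than
    a full mismatch, which raises the reported percentage.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    score = 0.0
    columns = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # ignore columns that are empty in both sequences
        columns += 1
        if x == "-" or y == "-":
            continue  # a gap column counts as a full mismatch
        if x == y:
            score += 1.0
        elif _same_group(x, y):
            score += partial_credit
    if columns == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * score / columns
```

For two five-residue peptides that differ only by an aspartic acid to glutamic acid change, the unadjusted value is 80% and the value with a partial credit of 0.5 is 90%.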

As used herein, the term “comparison window” includes reference to a contiguous and specified segment of a polynucleotide sequence, wherein the polynucleotide sequence may be compared to a reference sequence and wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (e.g., gaps) compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. Generally, the comparison window is at least 20 contiguous nucleotides in length, and can be 30, 40, 50, 100, or even longer. Persons skilled in the art will recognize that to avoid a high similarity to a reference sequence due to inclusion of gaps in the polynucleotide sequence, a gap penalty is typically introduced and is subtracted from the number of matches.

The alignment of polynucleotide and/or polypeptide sequences for the purposes of determining sequence identity and similarity can be by either manual alignment and visual inspection or via the use of some type of computer program or algorithm. In fact, a number of computer programs which can be used to align polynucleotide and/or polypeptide sequences are known in the art. Examples include the programs available in the Wisconsin Sequence Analysis Package, Version 9 (available from the Genetics Computer Group, Madison, Wis., 52711), such as GAP, BESTFIT, FASTA and TFASTA. For example, the GAP program is capable of calculating both the identity and similarity between two polynucleotide or two polypeptide sequences. Specifically, the GAP program uses the homology alignment algorithm of Needleman and Wunsch (J. Mol. Biol., 48:443-453 (1970)). Another example of a useful computer program is PILEUP. PILEUP creates a multiple sequence alignment from a group of related sequences using progressive, pairwise alignments to show relationship and percent sequence identity. It also plots a tree or dendrogram showing the clustering relationships used to create the alignment. PILEUP uses a simplification of the progressive alignment method of Feng & Doolittle, J. Mol. Evol., 25:351-360 (1987). Yet another example of a useful computer program that can be used for determining percent sequence identity and sequence similarity is the BLAST algorithm (Altschul et al., J. Mol. Biol., 215:403-410 (1990)). The software for performing BLAST analysis is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/).
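For illustration, a global alignment in the style of Needleman and Wunsch can be sketched in Python as below. This is a simplified teaching version and not the GAP program: the function name `needleman_wunsch` and the scoring values (match +1, mismatch -1, linear gap penalty -1) are assumptions chosen for the example, whereas production programs typically use substitution matrices and separate gap creation and extension penalties.

```python
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1):
    """Globally align a and b; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    def pair_score(i: int, j: int) -> int:
        # Score for aligning a[i-1] with b[j-1] (1-based matrix indices).
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(i, j),  # align two residues
                score[i - 1][j] + gap,                   # gap in b
                score[i][j - 1] + gap,                   # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

The gap penalty is subtracted for each gap position, so an alignment cannot inflate its number of matches simply by inserting gaps. The aligned strings returned by the sketch can then be compared column by column to compute percent identity.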

With respect to polynucleotide sequences, the term “substantial identity” means that a polynucleotide comprises a sequence that has at least 60% sequence identity, preferably at least 70% sequence identity, more preferably at least 80% sequence identity, even more preferably at least 90% sequence identity and most preferably at least 95% sequence identity, compared to a reference sequence using one of the alignment programs described herein conducted according to standard parameters. One skilled in the art will recognize that these values can be appropriately adjusted to determine corresponding identity of proteins encoded by two nucleotide sequences by taking into account codon degeneracy, amino acid similarity, reading frame positioning and the like. Substantial identity of amino acid sequences for these purposes normally means sequence identity of at least 60%, more preferably at least 70%, 80%, 90% identity, and most preferably at least 95% identity.

Polynucleotide sequences can also be considered to be substantially identical if two molecules hybridize to each other under stringent conditions. However, polynucleotides which do not hybridize to each other under stringent conditions are still substantially identical if the polypeptides which they encode are substantially identical. This can occur when a copy of a polynucleotide is created using the maximum codon degeneracy permitted by the genetic code. One indication that two polynucleotide sequences are substantially identical is that the polypeptide encoded by the first polynucleotide is immunologically cross-reactive with the polypeptide encoded by the second polynucleotide.

With peptides, the term “substantial identity” as used herein means that a peptide comprises a sequence having at least 60% sequence identity to a reference sequence, preferably 70% sequence identity, more preferably 80% sequence identity, even more preferably 90% sequence identity, and most preferably at least 95% sequence identity to the reference sequence over a specified comparison window. Preferably, optimal alignment is conducted using the homology alignment algorithm (GAP program discussed previously) of Needleman and Wunsch, J. Mol. Biol., 48:443-453 (1970). An indication that two peptide sequences are substantially identical is that one peptide is immunologically reactive with antibodies raised against the second peptide. Thus, a peptide is substantially identical to a second peptide where the two peptides differ only by a conservative substitution. Peptides which are “substantially similar” share sequences as described above except that any residue positions which are not identical differ only by conservative amino acid changes.

As used herein, the term “Mez1 gene” refers to a gene of the present invention, specifically, the heterologous genomic form of a full length Mez1 polynucleotide.

As used herein, the term “Mez1 nucleic acid” refers to a nucleic acid of the present invention, specifically, a nucleic acid comprising a polynucleotide of the present invention encoding a Mez1 polypeptide (hereinafter a “Mez1 polynucleotide”). An example of a Mez1 polynucleotide (cDNA) is shown in SEQ ID NO:1.

As used herein, the terms “Mez1 polypeptide”, “Mez1 peptide” and “Mez1 protein” are used interchangeably to refer to a polypeptide shown in SEQ ID NO:2. The terms also include fragments, variants, homologs, alleles or precursors (e.g., preproproteins or proproteins) thereof.

As used herein, the term “Mez2 gene” refers to a gene of the present invention, specifically, the heterologous genomic form of a full length Mez2 polynucleotide.

As used herein, the term “Mez2 nucleic acid” refers to a nucleic acid of the present invention, specifically, a nucleic acid comprising a polynucleotide of the present invention encoding a Mez2 polypeptide (hereinafter a “Mez2 polynucleotide”). An example of a Mez2 polynucleotide (cDNA) is shown in SEQ ID NO:3.

As used herein, the terms “Mez2 polypeptide”, “Mez2 peptide” and “Mez2 protein” are used interchangeably to refer to a polypeptide shown in SEQ ID NO:4. The terms also include fragments, variants, homologs, alleles or precursors (e.g., preproproteins or proproteins) thereof. A “Mez2 protein” is a protein of the present invention and comprises a Mez2 polypeptide.

As used herein, the term “nucleic acid” refers to a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form, and unless otherwise limited, encompasses known analogues having the essential nature of natural nucleotides in that they hybridize to single-stranded nucleic acids in a manner similar to naturally occurring nucleotides (e.g., peptide nucleic acids).

As used herein, the term “nucleotide(s)” refers to a macromolecule containing a sugar (either a ribose or deoxyribose), a phosphate group and a nitrogenous base.

As used herein, the term “operably linked” includes reference to a functional linkage between a promoter and a second sequence, wherein the promoter sequence initiates and mediates transcription of the DNA sequence corresponding to the second sequence. Generally, operably linked means that the polynucleotide sequences being linked are contiguous and, where necessary to join two protein coding regions, contiguous and in the same reading frame.

As used herein, the term “plant” includes reference to whole plants, plant organs (e.g., leaves, stems, flowers, roots, etc.), seeds and plant cells and progeny of the same. Plant cell, as used herein, includes, but is not limited to, suspension cultures, embryos, meristematic regions, callus tissue, shoots, gametophytes, sporophytes, pollen and microspores. The class of plants which can be used in the methods of the present invention is generally as broad as the class of higher plants amenable to transformation techniques, including angiosperms (monocotyledonous and dicotyledonous plants) as well as gymnosperms (e.g., Coniferophyta (conifers), Cycadophyta (cycads), Ginkgophyta (maidenhair tree) and Gnetophyta (gnetophytes)). The term “plant” as used herein also includes plants of a variety of ploidy levels, such as polyploid, diploid, haploid and hemizygous.

As used herein, the term “plant promoter” refers to a promoter capable of initiating transcription in plant cells.

As used herein, the term “polymorphic variant” in connection with a polynucleotide sequence refers to a variation in the polynucleotide sequence of a particular gene between individuals of a given species. Polymorphic variants may also encompass “single nucleotide polymorphisms” (SNPs) in which the polynucleotide sequence varies by one base. The presence of SNPs may be indicative of a certain population for a disease state or propensity for a disease state.

As used herein, the term “polynucleotide” refers to a deoxyribopolynucleotide, ribopolynucleotide, or analogs thereof that have the essential nature of a natural ribonucleotide in that they hybridize, under stringent hybridization conditions, to substantially the same nucleotide sequence as naturally occurring nucleotides and/or allow translation into the same amino acid(s) as the naturally occurring nucleotide(s). A polynucleotide can be full length or a subsequence of a native or heterologous structural or regulatory gene. Unless otherwise indicated, the term includes reference to the specified sequence as well as the complementary sequence thereof. Thus, DNAs or RNAs with backbones modified for stability or for other reasons are “polynucleotides” as that term is intended herein. Moreover, DNAs or RNAs comprising unusual bases, such as inosine, or modified bases, such as tritylated bases, to name just two examples, are polynucleotides as the term is used herein. As used herein, the term polynucleotide includes such chemically, enzymatically or metabolically modified forms of polynucleotides, as well as the chemical forms of DNA and RNA characteristic of viruses and cells, including, but not limited to, simple and complex cells.

As used herein, the terms “polypeptide”, “peptide” and “protein” are used interchangeably to refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residues are artificial chemical analogues of corresponding naturally occurring amino acids, as well as to naturally occurring amino acid polymers. The essential nature of such analogues of naturally occurring amino acids is that, when incorporated into a protein, that protein is specifically reactive to antibodies elicited to the same protein but consisting entirely of naturally occurring amino acids. The terms “polypeptide”, “peptide” and “protein” are also inclusive of modifications including, but not limited to, glycosylation, lipid attachment, sulfation, gamma-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation.

As used herein, the term “promoter” refers to a region of DNA upstream from the start of transcription and involved in recognition and binding of RNA polymerase and other proteins to initiate transcription. A promoter can optionally include distal enhancers or repressor elements which can be located several thousand base pairs from the start site of transcription.

As used herein, the term “recombinant” includes reference to a cell, or nucleic acid, or vector, that has been modified by the introduction of a heterologous nucleic acid or the alteration of a native nucleic acid to a form not native to that cell, or that the cell is derived from a cell so modified. For example, recombinant cells express genes that are not found within the native (non-recombinant) form of the cell or express native genes that are otherwise abnormally expressed, under expressed or not expressed at all.

As used herein, the term “recombinant expression cassette” refers to a nucleic acid construct, generated recombinantly or synthetically, with a series of specified nucleic acid elements which permit transcription of a particular nucleic acid in a target cell. The expression vector can be part of a plasmid, virus, or nucleic acid fragment. Typically, the recombinant expression cassette portion of the expression vector includes a nucleic acid to be transcribed, and a promoter.

As used herein, the terms “residue”, “amino acid” and “amino acid residue” are used interchangeably to refer to an amino acid that is incorporated into a protein, polypeptide or peptide. The amino acid may be a naturally occurring amino acid, and unless otherwise limited, may encompass known analogs of natural amino acids that can function in a similar manner as naturally occurring amino acids.

As used herein, the terms “selective hybridization” and “selectively hybridizes” are used interchangeably and include reference to hybridization, under stringent hybridization conditions, of a nucleic acid sequence to a specified nucleic acid target sequence to a detectably greater degree (e.g., at least 2-fold over background) than its hybridization to non-target nucleic acid sequences and to the substantial exclusion of non-target nucleic acids. Selectively hybridizing sequences typically have at least about 80% sequence identity, preferably 90% sequence identity, and most preferably 100% sequence identity (i.e., complementary) with each other.

As used herein, the term “specifically binds” includes reference to the preferential association of a ligand, in whole or part, with a particular target molecule (i.e., “binding partner” or “binding moiety”) relative to compositions lacking that target molecule. It is, of course, recognized that a certain degree of non-specific interaction may occur between a ligand and a non-target molecule. Nevertheless, specific binding may be distinguished as mediated through specific recognition of the target molecule. Typically, specific binding results in a much stronger association between the ligand and the target molecule than between the ligand and non-target molecule. Specific binding by an antibody to a protein under such conditions requires an antibody that is selected for its specificity for a particular protein. The affinity constant of the antibody binding site for its cognate monovalent antigen is at least 10⁷, usually at least 10⁹, more preferably at least 10¹⁰, and most preferably at least 10¹¹ liters/mole.

As used herein, the terms “stringent hybridization conditions” and “stringent conditions” refer to conditions under which a probe will hybridize to its target subsequence, typically in a complex mixture of nucleic acid, but to no other sequences. Stringent conditions are sequence dependent and are different under different environmental parameters. An extensive guide to hybridization of nucleic acids is found in Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic Acid Probes Part 1, Chapter 2 “Overview of Principles of Hybridization and the Strategy of Nucleic Acid Probe Assays” Elsevier, N.Y. Generally, highly stringent conditions are selected to be about 5° C.-10° C. lower than the thermal melting point (T_(m)) for the specific sequence at a defined ionic strength and pH. The T_(m) is the temperature (under defined ionic strength, pH and nucleic acid concentration) at which 50% of the target sequence hybridizes to a perfectly matched probe. Stringent conditions are those in which the salt concentration is less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion concentration (or other salts) at a pH of 7.0 to 8.3 and at a temperature of at least about 30° C. for short probes (such as those having a length between about 10 to 50 nucleotides) and at least about 60° C. for long probes (such as those having a length greater than 50 nucleotides). In contrast, low stringency conditions are at about 15-30° C. below the T_(m). Longer sequences hybridize at higher temperatures.
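The temperature ranges in this definition can be expressed as a small calculation. The following is a minimal sketch, assuming the T_(m) of the probe-target duplex is already known (this definition does not give a formula for it). The function name `stringency_window` and the example T_(m) of 72° C. are hypothetical and used only for illustration.

```python
def stringency_window(tm_celsius, level="high"):
    """Return (lowest, highest) hybridization temperature in deg C for a T_m.

    Offsets follow the definition in the text: highly stringent conditions
    are about 5-10 deg C below T_m; low stringency is about 15-30 deg C below.
    """
    offsets = {"high": (10.0, 5.0), "low": (30.0, 15.0)}
    if level not in offsets:
        raise ValueError("level must be 'high' or 'low'")
    far, near = offsets[level]
    return (tm_celsius - far, tm_celsius - near)


# Illustrative T_m only; not a measured value for any Mez1 or Mez2 probe.
print(stringency_window(72.0, "high"))  # (62.0, 67.0)
print(stringency_window(72.0, "low"))   # (42.0, 57.0)
```

The sketch covers only the temperature offsets; salt concentration, pH and probe length, which the definition also constrains, would have to be checked separately.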

As used herein, the term “tissue-specific promoter” includes reference to a promoter in which expression of an operably linked gene is limited to a particular tissue or tissues.

As used herein, the term “transgenic plant” includes reference to a plant modified by introduction of a heterologous polynucleotide. Generally, the heterologous polynucleotide is a Mez1 or Mez2 structural or regulatory gene or subsequences thereof.

SEQUENCE LISTINGS

The present application also contains a sequence listing that contains twenty (20) sequences. The sequence listing contains nucleotide sequences and amino acid sequences. For the nucleotide sequences, the base pairs are represented by the following base codes:

Symbol    Meaning
A         A; adenine
C         C; cytosine
G         G; guanine
T         T; thymine
U         U; uracil
M         A or C
R         A or G
W         A or T/U
S         C or G
Y         C or T/U
K         G or T/U
V         A or C or G; not T/U
H         A or C or T/U; not G
D         A or G or T/U; not C
B         C or G or T/U; not A
N         A or C or G or T/U
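The base codes can be read as sets of permitted bases. The following is a minimal sketch of how a sequence containing ambiguity symbols might be compared with a concrete sequence; the names `IUPAC`, `matches` and `degenerate_match` are hypothetical, and T and U are treated as equivalent purely for convenience.

```python
# Base codes from the listing above; T/U is written here as "T".
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}


def matches(code, base):
    """True if a concrete base (A, C, G, T or U) is allowed by a base code."""
    base = "T" if base.upper() == "U" else base.upper()
    return base in IUPAC[code.upper()]


def degenerate_match(pattern, sequence):
    """True if each base of `sequence` is allowed by `pattern` at that position."""
    return len(pattern) == len(sequence) and all(
        matches(p, s) for p, s in zip(pattern, sequence)
    )


print(degenerate_match("GNG", "GAG"))  # True
print(degenerate_match("RY", "AC"))    # True
print(degenerate_match("RY", "CA"))    # False
```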

The amino acids shown in the application are in the L-form and are represented by the following amino acid three-letter abbreviations:

Abbreviation    Amino acid name
Ala             L-Alanine
Arg             L-Arginine
Asn             L-Asparagine
Asp             L-Aspartic Acid
Asx             L-Aspartic Acid or Asparagine
Cys             L-Cysteine
Glu             L-Glutamic Acid
Gln             L-Glutamine
Glx             L-Glutamine or Glutamic Acid
Gly             L-Glycine
His             L-Histidine
Ile             L-Isoleucine
Leu             L-Leucine
Lys             L-Lysine
Met             L-Methionine
Phe             L-Phenylalanine
Pro             L-Proline
Ser             L-Serine
Thr             L-Threonine
Trp             L-Tryptophan
Tyr             L-Tyrosine
Val             L-Valine
Xaa             L-Unknown or other

Introduction

The present invention is based, at least in part, on the discovery and cloning of two (2) PcG genes from Zea mays L. (maize) termed the Mez1 gene and the Mez2 gene. The Mez1 gene has been mapped to chromosome 6 (bin 6.01-6.02) and the Mez2 gene has been mapped to chromosome 9 (bin 9.04).

The present invention is applicable to a broad range of types of plants, including, but not limited to, Zea mays L., Oryza sativa, Secale cereale, Triticum aestivum, Daucus carota, Brassica oleracea, Cucumis melo, Cucumis sativus, Lactuca sativa, Solanum tuberosum, Lycopersicon esculentum, Phaseolus vulgaris, and Brassica napus.

Nucleic Acids

In one embodiment, the present invention relates to isolated nucleic acids of DNA, RNA, and analogs and/or chimeras thereof, comprising a polynucleotide, wherein said polynucleotide is a Mez1 or Mez2 polynucleotide which encodes a polypeptide of SEQ ID NO:2 (a Mez1 polypeptide) or SEQ ID NO:4 (a Mez2 polypeptide), and conservatively modified variants thereof. It is known in the art that the degeneracy of the genetic code allows for a plurality of polynucleotides to encode the identical amino acid sequence. These “silent variations”, as they are commonly referred to, can be used to selectively hybridize and detect polymorphic variants of the polynucleotides of the present invention.

An example of a Mez1 polynucleotide which encodes the Mez1 polypeptide of SEQ ID NO:2 is shown in SEQ ID NO:1. The polynucleotide of SEQ ID NO:1 is 3180 base pairs in length.

An example of a Mez2 polynucleotide which encodes the Mez2 polypeptide of SEQ ID NO:4 is shown in SEQ ID NO:3. The polynucleotide of SEQ ID NO:3 is 3030 base pairs in length.

The Mez2 polynucleotide of SEQ ID NO:3, in addition to encoding the Mez2 polypeptide, contains two (2) alternative splice sites. These alternative splice sites are referred to herein as Mez2 alternative splice 1 (“Mez2^(as1)”) (SEQ ID NO:5) and Mez2 alternative splice 2 (“Mez2^(as2)”) (SEQ ID NO:6). The polynucleotide sequence of Mez2^(as1) (hereinafter “Mez2^(as1) polynucleotide”) is identical to the Mez2 polynucleotide of SEQ ID NO:3 except that the Mez2^(as1) polynucleotide is missing a fragment of 659 base pairs in length. Specifically, this deleted fragment corresponds to positions 1016 to 1676 in the Mez2 polynucleotide of SEQ ID NO:3. The Mez2^(as1) polynucleotide deletion causes a frameshift and a truncated protein of 341 amino acids which is missing the SANT, nuclear localization signal, cysteine rich region and SET domains (See FIG. 7).

The polynucleotide sequence of Mez2^(as2) (hereinafter “Mez2^(as2) polynucleotide”) is identical to the Mez2 polynucleotide of SEQ ID NO:3 except that the Mez2^(as2) polynucleotide is missing a fragment of 810 base pairs in length. Specifically, this deleted fragment corresponds to positions 1016 to 1827 in the Mez2 polynucleotide of SEQ ID NO:3. The Mez2^(as2) polynucleotide deletion does not result in a frameshift. The deletion in Mez2^(as2) results in a 624 amino acid protein that is missing the SANT domain (See FIG. 7).
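Whether a deletion shifts the reading frame depends only on whether its length is a multiple of three. The following is a minimal sketch of that check for the two splice variants. It assumes that the listed positions are the bases flanking each deleted fragment, an interpretation chosen here only because it reproduces the stated fragment lengths; the function names are hypothetical.

```python
def deleted_length(left_flank, right_flank):
    """Number of bases strictly between two flanking positions (assumed reading)."""
    return right_flank - left_flank - 1


def preserves_frame(n_deleted):
    """A deletion keeps the downstream reading frame only if it removes whole codons."""
    return n_deleted % 3 == 0


variants = [("Mez2^(as1)", 1016, 1676), ("Mez2^(as2)", 1016, 1827)]
for name, left, right in variants:
    n = deleted_length(left, right)
    print(name, n, "in frame" if preserves_frame(n) else "frameshift")
# Mez2^(as1) 659 frameshift
# Mez2^(as2) 810 in frame
```

The result agrees with the description: the 659 base pair deletion leaves a remainder of two bases and shifts the frame, while the 810 base pair deletion removes exactly 270 codons.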

In another embodiment, the present invention also provides isolated nucleic acids comprising polynucleotides encoding conservatively modified variants of the Mez1 or Mez2 polypeptides of SEQ ID NOS:2 and 4. Such conservatively modified variants can be used for a number of useful purposes, such as, but not limited to, the generation or selection of antibodies immunoreactive to the non-variant polypeptide. Also, in yet another embodiment, the present invention relates to isolated nucleic acids comprising polynucleotides encoding one or more polymorphic variants of polypeptides/polynucleotides. Polymorphic variants are used to follow the segregation of chromosome regions and are typically used in marker assisted selection methods for crop improvement.

In another embodiment, the present invention relates to the isolation of nucleic acids comprising polynucleotides of the present invention which selectively hybridize, under selective hybridization conditions (i.e., stringent hybridization conditions), to the Mez1 or Mez2 polynucleotide. The isolation of such nucleic acids can be accomplished by a number of techniques. For example, oligonucleotide probes based upon the Mez1 and Mez2 polynucleotides described herein can be used to identify, isolate or amplify partial or full length clones in a deposited library (such as a cDNA or genomic DNA library). For example, a cDNA or genomic library can be screened using a probe based upon the sequence of the Mez1 or Mez2 polynucleotides described herein. These probes can be used to hybridize with genomic DNA or cDNA sequences to isolate homologous genes in the same or different plant species.

Alternatively, nucleic acids of interest can be amplified from nucleic acid samples using various amplification techniques known in the art. For example, PCR can be used to amplify the sequences of the Mez1 or Mez2 genes directly from genomic DNA, from cDNA, from genomic libraries or cDNA libraries. PCR and other in vitro amplification methods (such as LCR, etc.) can be used to clone nucleic acid sequences that code for proteins to be expressed, to make nucleic acids for use as probes for detecting the presence of the desired mRNA in samples, for nucleic acid sequencing or for other purposes.

In yet another embodiment, the present invention relates to isolated nucleic acids comprising polynucleotides, wherein the polynucleotides of said nucleic acids have a specified identity at the nucleotide level to the previously described Mez1 or Mez2 polynucleotides. The percentage of identity is at least 60%, preferably 70%, more preferably 80%, even more preferably 90% and most preferably 95%.
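The identity thresholds can be illustrated with a short calculation. The following is a minimal sketch, assuming two sequences that have already been aligned (for example by one of the alignment programs referred to herein) with "-" marking gaps. The gap-handling convention, the function names and the tier labels are assumptions for illustration and are not taken from the specification; the example sequences are made up.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns; '-' marks a gap.

    Columns where both sequences have a gap are ignored, and a gap opposite
    a base counts as a mismatch. Other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    same = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * same / len(columns)


# Thresholds from the text; the labels are illustrative only.
TIERS = [(95, "most preferred"), (90, "even more preferred"),
         (80, "more preferred"), (70, "preferred"), (60, "minimum")]


def identity_tier(pct):
    """Return the label of the highest threshold that `pct` meets."""
    for threshold, label in TIERS:
        if pct >= threshold:
            return label
    return "below 60%"


pct = percent_identity("ATGGCTAAGT", "ATGGCAAAGT")  # made-up sequences
print(pct, identity_tier(pct))  # 90.0 even more preferred
```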

In yet another embodiment, the present invention relates to isolated nucleic acids comprising polynucleotides complementary to the previously described Mez1 or Mez2 polynucleotides. One skilled in the art will recognize that complementary sequences will base pair throughout their entire length with the previously described Mez1 or Mez2 polynucleotides (meaning that they have 100% sequence identity over their entire length). Complementary bases associate through hydrogen bonding in double stranded nucleic acids. Base pairs known to be complementary include the following: adenine and thymine, guanine and cytosine and adenine and uracil.
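The base-pairing rules listed in this paragraph are enough to derive a complementary strand. The following is a minimal sketch using a made-up six-base sequence; the function names are hypothetical.

```python
# Watson-Crick pairs named in the text (adenine-thymine, guanine-cytosine).
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Return the complementary DNA strand, written 5' to 3'."""
    return "".join(DNA_PAIRS[base] for base in reversed(dna.upper()))


def complementary_rna(dna):
    """Return the complementary RNA strand (adenine pairs with uracil)."""
    return reverse_complement(dna).replace("T", "U")


print(reverse_complement("ATGGCA"))  # TGCCAT
print(complementary_rna("ATGGCA"))   # UGCCAU
```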

In yet another embodiment, the present invention relates to isolated nucleic acids comprising polynucleotides which comprise at least 15 contiguous bases from the previously described Mez1 or Mez2 polynucleotides. More specifically, the length of the polynucleotides can be from about 15 contiguous bases to the length of the Mez1 or Mez2 polynucleotide of which the polynucleotide is a subsequence. For example, such polynucleotides can be 15, 35, 55, 75, 95, 100, 200, 400, 500, 750, etc. contiguous nucleotides in length from the previously described Mez1 or Mez2 polynucleotide. In addition, such subsequences can optionally comprise or lack certain structural characteristics from the Mez1 or Mez2 polynucleotides from which they are derived.

Polypeptides

In one embodiment, the present invention relates to a Mez1 polypeptide of SEQ ID NO:2. The Mez1 polypeptide is 931 amino acids in length, has a molecular weight of about 103.75 kDa and an isoelectric point of 8.91.

In a second embodiment, the present invention relates to a Mez2 polypeptide of SEQ ID NO:4. The Mez2 polypeptide is 893 amino acids in length, has a molecular weight of about 100.01 kDa and an isoelectric point of 8.47.

The Mez1 and Mez2 polypeptides contain a number of domains. These domains are: EZD1, EZD2, SANT domain, cysteine rich region and SET domain (See, FIG. 5). The EZD1 and EZD2 regions are conserved domains specific to the E(z) family. EZD1 is a highly conserved acidic region of 74 amino acids in the N-terminal region. The EZD1 domain contains a significant proportion of charged residues (34-39%) with seven more acidic residues than basic residues. The function of this domain is presently not known. The EZD1 is highly conserved between Mez1, Mez2, clf and EZA1. EZD2 is a small, highly conserved region of 44 amino acids near amino acid 250 of the plant and animal E(z)-like proteins. The EZD2 region is composed primarily of polar or charged residues. There are two (2) regions near the C-terminus of these proteins that are well conserved among all E(z) proteins (See FIG. 5). These are the cysteine rich region and the SET domain. The Cys-rich region has fifteen invariant cysteine residues with a conserved spacing pattern in all E(z) homologs. The spacing of the cysteine residues in all E(z) homologs is unique and is different from other Cys-rich zinc finger domains involved in DNA binding. The function of the cysteine rich domain is not known but it is highly conserved among all E(z)-like genes. The SET domain is also highly conserved and is believed to be involved in mediating protein-protein interactions (Cui et al., Nat. Genet., 18:331-337 (1998); Huang et al., J. Biol. Chem., 273:15933-15939 (1998)). The SANT binding domain is often involved in non-specific DNA binding (Aasland, R., et al., Trends Biochem. Sci., 21(3):87-88 (1996)).

In another embodiment, the present invention relates to a polypeptide having a specified percentage of sequence identity with the Mez1 or Mez2 polypeptide of the present invention. The percentage of sequence identity is at least 60%, preferably 70%, more preferably 80%, even more preferably 90% and most preferably 95%.

The present invention also provides antibodies which specifically react with the Mez1 or Mez2 polypeptides of the present invention under immunologically reactive conditions. An antibody immunologically reactive with a particular antigen can be generated in vivo or by recombinant methods such as by selection of libraries of recombinant antibodies in phage or similar vectors.

Many methods of making antibodies are known to persons skilled in the art. A number of immunogens can be used to produce antibodies specifically reactive to the isolated Mez1 or Mez2 polypeptides of the present invention under immunologically reactive conditions. An isolated recombinant, synthetic, or native Mez1 or Mez2 polypeptide of the present invention is the preferred immunogen (antigen) for the production of monoclonal or polyclonal antibodies.

The Mez1 or Mez2 polypeptide can be injected into an animal capable of producing antibodies. Either monoclonal or polyclonal antibodies can be generated for subsequent use in immunoassays to measure the presence and quantity of the Mez1 or Mez2 polypeptide. Methods of producing monoclonal or polyclonal antibodies are known to persons skilled in the art (See, Coligan, Current Protocols in Immunology, Wiley/Greene, NY (1991); Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Press, NY (1989); and Goding, Monoclonal Antibodies: Principles and Practice (2d ed.), Academic Press, New York, N.Y. (1986)).

The Mez1 or Mez2 polypeptides and antibodies can be labeled by joining, either covalently or non-covalently, a substance which provides for a detectable signal. A wide variety of labels and conjugation techniques are known to persons skilled in the art. Suitable labels include radionuclides, enzymes, substrates, cofactors, inhibitors, fluorescent moieties, chemiluminescent moieties, magnetic particles, and the like. Patents teaching the use of such labels include U.S. Pat. Nos. 3,817,837; 3,850,752; 3,939,350; 3,996,345; 4,277,437; 4,275,149; and 4,366,241.

The antibodies of the present invention can be used to screen plants for the expression of the Mez1 or Mez2 polypeptides of the present invention. The antibodies of the present invention can also be used for affinity chromatography for the purpose of isolating Mez1 or Mez2 polypeptides.

The present invention further provides Mez1 or Mez2 polypeptides that specifically bind, under immunologically reactive conditions, to an antibody generated against a defined immunogen, such as an immunogen consisting of the Mez1 or Mez2 polypeptides. Immunogens will generally have a length of at least 10 contiguous amino acids from the Mez1 or Mez2 polypeptides of the present invention, respectively.

A variety of immunoassay formats are appropriate for selecting antibodies specifically reactive with a particular protein. For example, solid-phase ELISA immunoassays are routinely used to select monoclonal antibodies specifically reactive with a protein (See Harlow and Lane, Antibodies, A Laboratory Manual, Cold Spring Harbor Publications, New York (1988), for a description of immunoassay formats and conditions that can be used to determine specific reactivity). The antibody may be polyclonal but preferably is monoclonal. Generally, antibodies cross-reactive to Mez1 or Mez2 polypeptides are removed by immunoabsorption.

Immunoassays in the competitive binding format are typically used for cross-reactivity determinations. For example, an immunogenic Mez1 or Mez2 polypeptide can be immobilized to a solid support. Polypeptides added to the assay compete with the binding of the antisera to the immobilized antigen. The ability of the above polypeptides to compete with the binding of the antisera to the immobilized Mez1 or Mez2 polypeptide is compared to the immunogenic Mez1 or Mez2 polypeptide. The percent cross-reactivity for the above proteins is calculated, using standard calculations known to persons skilled in the art.

The immunoabsorbed and pooled antisera are then used in a competitive binding immunoassay to compare a second “target” polypeptide to the immunogenic polypeptide. In order to make this comparison, the two polypeptides are each assayed at a wide range of concentrations and the amount of each polypeptide required to inhibit 50% of the binding of the antisera to the immobilized protein is determined using standard techniques. If the amount of the target polypeptide required is less than twice the amount of the immunogenic polypeptide that is required, then the target polypeptide is said to specifically bind to an antibody generated to the immunogenic protein. As a final determination of specificity, the pooled antisera are fully immunoabsorbed with the immunogenic polypeptide until no binding to the polypeptide used in the immunoabsorption is detectable. The fully immunoabsorbed antisera are then tested for reactivity with the test polypeptide. If no reactivity is observed, then the test polypeptide is specifically bound by the antisera elicited by the immunogenic protein.
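The comparison described in this paragraph reduces to a single inequality on the two 50%-inhibition amounts. The following is a minimal sketch of that decision rule; the function name `specifically_binds` and the example amounts are hypothetical, and the amounts are assumed to be expressed in the same units.

```python
def specifically_binds(amount_target, amount_immunogen):
    """Apply the rule in the text to two 50%-inhibition amounts.

    The target polypeptide is said to bind specifically when the amount of
    target needed for 50% inhibition is less than twice the amount of the
    immunogenic polypeptide needed for the same inhibition.
    """
    if amount_target <= 0 or amount_immunogen <= 0:
        raise ValueError("amounts must be positive")
    return amount_target < 2 * amount_immunogen


# Made-up amounts in arbitrary but identical units.
print(specifically_binds(1.5, 1.0))  # True
print(specifically_binds(2.0, 1.0))  # False (not less than twice)
print(specifically_binds(5.0, 1.0))  # False
```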

Production of Recombinant Expression Cassettes

Isolated nucleic acids of the present invention can be used in recombinant expression cassettes. One of ordinary skill in the art will recognize that a nucleic acid used in the recombinant expression cassettes described herein encoding a functional Mez1 or Mez2 polypeptide need not have a sequence identical to the exemplified nucleic acids disclosed herein and does not need to be full length, so long as the desired functional domain of the Mez1 or Mez2 protein is expressed.

A nucleic acid comprising a polynucleotide coding for the desired functional Mez1 or Mez2 polypeptide, for example a cDNA or a genomic sequence encoding a full length protein, can be used to construct a recombinant expression cassette which can be introduced into a desired plant. An expression cassette will typically comprise the functional Mez1 or Mez2 nucleic acid operably linked in either the sense or antisense direction to transcriptional and translational initiation regulatory sequences which will direct the transcription of the sequence from the functional Mez1 or Mez2 nucleic acid in the intended tissues for the transformed plant. Examples of transcriptional and translational initiation regions that can be used in the recombinant expression cassette are well known in the art.

The recombinant expression cassette will contain a promoter which is used to direct expression of the polynucleotides of the present invention in one, more than one, or in all of the tissues of a regenerated plant. For example, a constitutive plant promoter may be employed which will direct expression of the functional Mez1 or Mez2 polypeptide in all tissues of a regenerated plant. Examples of constitutive promoters include, but are not limited to, the cauliflower mosaic virus (hereinafter “CaMV”) 35S transcription initiation region, the NOS promoter, the RUBISCO promoter, the 1′- or 2′-promoter derived from T-DNA of Agrobacterium tumefaciens, etc. A suitable constitutive plant promoter to be used in the recombinant expression cassette can readily be determined by persons skilled in the art.

Alternatively, an inducible plant promoter can be used. An inducible plant promoter may direct expression of the Mez1 or Mez2 nucleic acid in specific tissue or under more precise environmental or developmental control in a regenerated plant. Examples of environmental conditions that may affect transcription by inducible promoters include pathogen attack, anaerobic conditions, or the presence of light. Examples of inducible promoters include, but are not limited to, the Hsp70 promoter (which is inducible by heat stress), the PPDK promoter (which is inducible by light), etc.

Promoters derived from the Mez1 or Mez2 genes can be used to direct expression. These promoters can also be used to direct expression of heterologous sequences. The promoters can be used, for example, in recombinant expression cassettes to drive expression of the Mez1 or Mez2 nucleic acids of the present invention or heterologous sequences.

Such promoters can be identified as follows. The 5′ portions of the Mez1 or Mez2 genes described herein are analyzed for sequences characteristic of promoter sequences. For instance, promoter sequence elements include the TATA box consensus sequence (TATAAT), which is usually 20 to 30 base pairs upstream of the transcription start site. In plants, further upstream from the TATA box, at positions −80 to −100, there is typically a promoter element with a series of adenines surrounding the trinucleotide G (or T) N G. (See, J. Messing et al., in Genetic Engineering in Plants, pp. 221-227 (Kosuge, Meredith and Hollaender, eds. 1983)).

If proper polypeptide expression is desired, a polyadenylation region at the 3′-end of the Mez1 or Mez2 polynucleotide coding region should be included. The polyadenylation region can be derived from a natural gene, from a variety of other plant genes, or from T-DNA. For example, polyadenylation regions can be derived from the nopaline synthase or octopine synthase genes.

The expression cassette comprising the Mez1 or Mez2 nucleic acids will typically comprise one or more marker genes which confer a selectable phenotype on plant cells. For example, the marker gene can encode biocide resistance, particularly antibiotic resistance, such as resistance to kanamycin, G418, bleomycin, hygromycin, or herbicide resistance, such as resistance to chlorsulfuron.

As discussed briefly above, the Mez1 or Mez2 nucleic acids can be inserted into a recombinant expression cassette in the antisense direction. Expression of the Mez1 or Mez2 nucleic acid in the antisense direction will result in the production of antisense RNA. It is well known to persons skilled in the art that a cell manufactures protein by transcribing the DNA of the gene encoding a protein to produce RNA, which is then processed to messenger RNA (hereinafter “mRNA”) (e.g., by the removal of introns) and finally translated by ribosomes into protein. This process may be inhibited in the cell by the presence of antisense RNA. It is believed that this inhibition takes place by formation of a complex between the two complementary strands of RNA, thus preventing the formation of protein. It is presently unclear how this mechanism works. However, it is believed that the complex may interfere with further translation, degrade the mRNA, or have more than one of these effects. This antisense RNA can be produced in the cell by transformation of the cell with an appropriate recombinant expression cassette designed to transcribe the non-template strand (as opposed to the template strand) of the relevant gene (or of a nucleic acid sequence showing substantial identity therewith).

The use of antisense RNA to downregulate the expression of specific plant genes is well known. Reduction in gene expression has been shown to lead to changes in the phenotype of a plant, either at the level of gross visible phenotypic difference (see van der Krol et al., Nature, 333:866-869 (1988)), or at a more subtle biochemical level (Smith et al., Nature, 334:724-726 (1988)). Another method for inhibiting gene expression in transgenic plants involves the use of sense RNA transcribed from an exogenous template to downregulate the expression of specific plant genes (See, Jorgensen, Keystone Symposium “Improved Crop and Plant Products through Biotechnology”, Abstract X1-022 (1994)). Thus, both antisense and sense RNA can be used to achieve downregulation of gene expression in plants, and both approaches are encompassed by the present invention.

Production of Transgenic Plants

Techniques for transforming a wide variety of higher plant species using the recombinant expression cassettes hereinbefore described are well known and described in the technical and scientific literature (See, for example, Weising et al., Ann. Rev. Genet., 22:421-477 (1988)).

The hereinbefore described recombinant expression cassettes can be introduced into the genome of a desired plant host by a variety of conventional techniques which are well known to persons skilled in the art. For example, the recombinant expression cassette can be introduced directly into the genomic DNA of the plant cell using techniques such as electroporation, PEG poration, particle bombardment, silicon fiber delivery, and microinjection of plant cell protoplasts or embryogenic callus, or the expression cassettes can be introduced directly to plant tissue using ballistic methods, such as DNA particle bombardment. Alternatively, the expression cassettes may be combined with suitable T-DNA flanking regions and introduced into a conventional Agrobacterium tumefaciens or Agrobacterium rhizogenes host vector. The virulence functions of the Agrobacterium host will direct the insertion of the expression cassette and adjacent marker gene into the plant cell DNA when the cell is infected by the bacteria.

Plants which can be transformed with the recombinant expression cassette of the present invention include, but are not limited to, Zea mays L., Oryza sativa, Secale cereale, Triticum aestivum, Daucus carota, Brassica oleracea, Cucumis melo, Cucumis sativus, Lactuca sativa, Solanum tuberosum, Lycopersicon esculentum, Phaseolus vulgaris, Brassica napus, etc.

Transformation techniques are well known to persons skilled in the art. For example, the introduction of expression cassettes using polyethylene glycol precipitation is described in Paszkowski et al., EMBO J., 3:2712-2722 (1984). Electroporation techniques are described in Fromm et al., Proc. Natl. Acad. Sci. USA, 82:5824 (1985). Biolistic transformation techniques are described in Klein et al., Nature, 327:70-73 (1987).

Agrobacterium tumefaciens-mediated transformation techniques are well known to persons skilled in the art (See, for example, Horsch et al., Science 233:496-498 (1984), and Fraley et al., Proc. Natl. Acad. Sci. USA, 80:4803 (1983)). Although Agrobacterium is useful primarily in dicots, certain monocots can be transformed by Agrobacterium. U.S. Pat. No. 5,550,318 describes Agrobacterium transformation of maize.

Moreover, the following methods of transfection or transformation can also be used: (a) Agrobacterium rhizogenes-mediated transformation (See, Lichtenstein and Fuller In Genetic Engineering, vol. 6, PWJ Rigby, Ed., London, Academic Press, (1987)); (b) liposome-mediated DNA uptake (See, Freeman et al., Plant Cell Physiol., 25:1353 (1984)); and (c) the vortexing method (See, Kindle, Proc. Natl. Acad. Sci. USA, 87:1228 (1990)).

Transformed plant cells which are derived by any of the above transformation techniques can be cultured to regenerate a whole plant which possesses the transformed genotype. Such regeneration techniques rely on manipulation of certain phytohormones in a tissue culture growth medium, typically relying on a biocide and/or herbicide marker which has been introduced together with the Mez1 or Mez2 nucleic acid. Plant regeneration from cultured protoplasts is described in Evans et al., Protoplasts Isolation and Culture, Handbook of Plant Cell Culture, pp. 124-176, Macmillan Publishing Company, New York, 1983; and Binding, Regeneration of Plants, Plant Protoplasts, pp. 21-73, CRC Press, Boca Raton, 1985. Regeneration can also be obtained from plant callus, explants, organs, or parts thereof. Such regeneration techniques are described generally in Klee et al., Ann. Rev. of Plant Phys. 38:467-486 (1987).

One of ordinary skill in the art will recognize that after the expression cassette is stably incorporated in transgenic plants and confirmed to be operable, it can be introduced into other plants by sexual crossing. Any of a number of standard breeding techniques can be used, depending upon the species to be crossed.

Transgenic plants containing the expression cassettes described herein can be identified by using restriction enzymes or High Performance Liquid Chromatography. Techniques for restriction enzymes and High Performance Liquid Chromatography are well known to persons skilled in the art. Transgenic plants containing the expression cassettes described herein can also be identified by Northern blot analysis, which is likewise well known to persons skilled in the art.

Synthetic Polypeptides and Purification of Polypeptides

In addition to being produced recombinantly, the polypeptides of the present invention can also be produced synthetically, using techniques known in the art. For example, polypeptides having a length of about 50 amino acids can be synthesized using solid phase synthesis techniques, such as those described by Barany and Merrifield, Solid-Phase Peptide Synthesis, pp. 3-284 in The Peptides. Analysis, Synthesis, Biology. Vol. 2: Special Methods in Peptide Synthesis, Part A.; Merrifield et al., J. Am. Chem. Soc. 85:2149-2156 (1963). Polypeptides having a length greater than about 50 amino acids can be synthesized by condensation of the amino and carboxy termini of shorter fragments, a technique which is well known to persons skilled in the art.

Polypeptides of the present invention, produced either recombinantly or synthetically, can be purified using standard techniques known to persons skilled in the art, including, but not limited to, column chromatography, selective precipitation with ammonium sulfate, affinity chromatography, etc.

Methods for Repressing the Expression or Inhibiting the Repression of Expression of a Target Gene In Vivo

The Mez1 and Mez2 proteins belong to the E(z) group of Polycomb proteins. As discussed previously, it is known in the art that the esc and esc-like (homolog) proteins interact with the E(z) and E(z)-like proteins in vivo to form complexes. The E(z) and esc proteins interact with each other, but are not known to physically interact with any other characterized PcG proteins. While C. elegans and plants contain homologs of the proteins in the E(z)/esc complex, they do not contain the PRC1 complex. The E(z)/esc complex has been found to repress the expression of a gene during a specific developmental stage and in a specific tissue in plants and C. elegans, which lack the PRC1 complex (see Goodrich et al., Nature, 386(6620):44-51 (1997), Holdeman et al., Development, 125(13):2457-67 (1998), Korf et al., Development, 125(13):2469-78 (1998), Kelly and Fire, Development, 125(13):2451-6 (1998)).

The Mez1 and Mez2 nucleic acids and proteins of the present invention can be used for a number of useful purposes. First, the Mez1 and/or Mez2 proteins can be used in a method to repress the expression of a desired target gene in specific tissue in a plant in vivo. The gene targeted for silencing would be in cells expressing either endogenous or introduced Mez1 and/or Mez2 and ZmFIE2 proteins. The ZmFIE2 protein is an esc-like protein isolated from Zea mays L. and is described in copending application U.S. Ser. No. 09/______ filed on Jul. 16, 2001 and entitled, “Polycomb Gene from Maize—ZmFIE2”, hereby incorporated by reference. The Mez1 and/or Mez2 nucleic acids and ZmFIE2 nucleic acids could be constitutively expressed in these cells or introduced into a plant containing the cells by crossing. The gene targeted for silencing may have any of a number of different promoters, but would also contain DNA sequence motifs or contexts to which the Mez1 and/or Mez2/ZmFIE2 complex is targeted. This would allow silencing of a gene in specific tissues or at specific times in development. For example, immature roots contain a non-functional Mez2 protein, but a functional ZmFIE2 protein. Therefore, these cells would not silence an introduced or endogenous gene containing DNA sequences which attract the Mez2/ZmFIE2 complex. In contrast, developing leaf tissues contain functional Mez2 and ZmFIE2 proteins. Therefore, an introduced or endogenous gene containing DNA sequences which attract the Mez2/ZmFIE2 complex would be silenced in those tissues.

Alternatively, the Mez1 and Mez2 proteins of the present invention can be used in a method to prevent the repression of a particular desired target gene in vivo in a plant. One mechanism by which this could be accomplished is by producing dominant negative mutant forms of said Mez1 and Mez2 proteins which fail to form a complex with any esc or esc-like proteins. In this approach, the recombinant expression cassette encodes a mutant Mez1 and/or Mez2 polypeptide (the mutant polypeptide contains various substitutions, deletions, additions, etc.) which fails to bind properly to any esc or esc-like proteins. As a result, the complex would not form.

A second mechanism by which this could be accomplished is through the use of antisense RNA. In this approach, recombinant expression cassettes containing the Mez1 and/or Mez2 nucleic acids in the antisense direction can be inserted into a plant. Preferably, the recombinant expression cassettes contain a tissue-specific promoter which will direct expression to the tissues containing the desired target gene of interest. The antisense RNA produced by the expression cassette will hybridize with the endogenous mRNA produced from the Mez1 or Mez2 genes within the plant, thus preventing the expression of any Mez1 or Mez2 protein. Because there will be no Mez1 or Mez2 protein, the complex between the Mez1 and/or Mez2 proteins and any esc or esc-like proteins will fail to form.

The use of the Mez1 and Mez2 proteins of the present invention to repress the expression or prevent the repression of the expression of a target gene in specific tissue in a plant in vivo could be used to regulate homeotic gene expression in plants to create novel plants having improved agronomic traits (see Goodrich et al., Nature, 386(6620):44-51 (1997)).

The following Examples are offered by way of illustration, not limitation.

EXAMPLE 1 Cloning and Characterization of the Mez1 and Mez2 Genes

Cloning of Mez1 and Mez2: Drosophila E(z) (AAC46462) was used in a TBLASTN search of the Pioneer Hi-Bred EST database. Two contigs with significant similarity were discovered, and named Maize E(z)-like 1 (Mez1) and Maize E(z)-like 2 (Mez2). Other contigs containing a SET domain were also present but displayed more similarity to trithorax than to E(z). The ctsbp19 clone contained the 3′ 801 bp of Mez1. The Mez2 contig originating from the cbmfe16 clone contained the 3′ 1144 bp of the Mez2 cDNA. To obtain full-length clones and sequence for the 5′ region of both genes, Rapid Amplification of cDNA Ends (RACE) was performed. Additionally, the 3′ ends of Mez1 and Mez2 were obtained by RACE to verify the EST sequence. RACE reactions were performed on one-week seedling Mo17 cDNA using the Marathon cDNA kit (Clontech, Palo Alto, Calif.) and Advantage2 polymerase (Clontech, Palo Alto, Calif.). The primers used were as follows: Mez1F1—GGG TGT GGT GAT GGT ACA TTG G (SEQ ID NO:7), Mez1R2—CAG CTT GTC ACC CAT TCT GTA TGC G (SEQ ID NO:8), Mez2R3—TGC CTC GTC CTT CTT TGA TCC TTC G (SEQ ID NO:9) and Mez2F3—CTC ACA AGG AAG CAG ACA AAC GCG G (SEQ ID NO:10). RACE products were gel purified and cloned into pGEM-T Easy (Promega, Madison, Wis.).
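For convenience, basic properties of the RACE primers recited above can be checked with a short Python sketch. The primer sequences are those given in the text; the helper names (`clean`, `gc_fraction`, `reverse_complement`) are hypothetical and are offered only as an illustration, not as part of the method actually used.

```python
# RACE primers as recited in the text (spaces separate triplets for readability).
primers = {
    "Mez1F1": "GGG TGT GGT GAT GGT ACA TTG G",
    "Mez1R2": "CAG CTT GTC ACC CAT TCT GTA TGC G",
    "Mez2R3": "TGC CTC GTC CTT CTT TGA TCC TTC G",
    "Mez2F3": "CTC ACA AGG AAG CAG ACA AAC GCG G",
}

DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def clean(seq: str) -> str:
    """Remove whitespace and upper-case a primer sequence."""
    return "".join(seq.split()).upper()


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the primer."""
    s = clean(seq)
    return (s.count("G") + s.count("C")) / len(s)


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence, written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(clean(seq)))


for name, seq in primers.items():
    print(f"{name}: {len(clean(seq))} nt, GC = {gc_fraction(seq):.0%}")
```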

Sequencing: The plasmids were sequenced using BigDye terminator cycle sequencing on an ABI sequencer (Perkin-Elmer Applied Biosystems). Sequencing reactions were done in a 10 μl volume with 320 ng DNA and 10 pg of primer. Primers used were as follows: T7 (Promega), SP6 (Promega), Mez1F1, Mez1F2—TAC CTT GGT GAG TAC ACT GGG GAA C (SEQ ID NO:11), Mez1F4—CCA TTT CGT GTA TCA GAC CTA AGC (SEQ ID NO:12), Mez1F5—CAT CAA CGC CCT CCA AGC (SEQ ID NO:13), Mez1R6—TGC CAC ATT CTT GAA CTG TCA TCC G (SEQ ID NO:14), Mez1R4—GCA CAG TGA CAT CCT CGA AAA CG (SEQ ID NO:15), Mez1R5—GTC CCT GCT CAA TTG CC (SEQ ID NO:16), Mez2F4—GCG GAC AAT TGT GCG GTT CG (SEQ ID NO:17), Mez2F5—GGT TGT TCA CAG AAT TTG G (SEQ ID NO:18), Mez2R4—CTT CCT AAC AAA ATC CTT TGC TGT TG (SEQ ID NO:19) and Mez2R5—TTG CTC CAT GTA GTC TTG (SEQ ID NO:20).

Sequence analysis: The sequences were assembled through the contig assembly program (http://gcg.tigem.it/ASSEMBLY/assemble.html). Reverse complement, translation and ClustalW were all accessed from the ABCC sequence analysis page (http://biosci.cbs.umn.edu/seqanal/). ClustalW alignments were processed using Boxshade (http://www.ch.embnet.org/software/BOX_form.html). All BLAST searches were performed using the NCBI BLAST feature. For some searches the advanced BLAST feature was used and a target organism was specified. Targeting signals and putative localization were predicted using PSORT (http://psort.nibb.ac.jp/). Domains were identified using SMART (http://smart.embl-heidelberg.de/).

Phylogenetic analysis: The SET domains from all E(z)-like proteins were aligned using ClustalW. This alignment was then submitted to the PHYLIP server at http://bioweb.pasteur.fr/seqanal/phylogeny/phylip-uk.html. The protpars feature was used with bootstrapping performed before analysis. One hundred replicates were examined to determine bootstrap values. The consensus tree was then displayed with bootstrap values.
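The resampling step that underlies bootstrapping can be sketched in Python as follows. This is a minimal illustration under stated assumptions and is not the PHYLIP implementation: the three-sequence toy alignment, the function name `bootstrap_alignment` and the fixed random seed are all hypothetical. Each replicate is built by drawing alignment columns with replacement; in an actual analysis a parsimony tree would then be inferred from each replicate and the trees summarized as a consensus.

```python
import random


def bootstrap_alignment(alignment: dict, rng: random.Random) -> dict:
    """Build one bootstrap replicate by sampling alignment columns with replacement."""
    n_cols = len(next(iter(alignment.values())))
    sampled = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[c] for c in sampled) for name, seq in alignment.items()}


# Hypothetical toy alignment; real input would be the aligned SET domains.
toy_alignment = {
    "seqA": "MKTAILV",
    "seqB": "MRTAIVV",
    "seqC": "MKSAILV",
}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(100)]
print(len(replicates), "replicates of length", len(replicates[0]["seqA"]))
```

The same column indices are applied to every sequence in a replicate, so each resampled column is still a genuine column of the original alignment.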

RT-PCR analysis: Total RNA was extracted from tissues including embryo, leaf, immature ear, immature tassel, 3-day root, pollen and BMS (Black Mexican Sweet) suspension cultures using TRIzol (Life Technologies Gibco/BRL). PolyA+ RNA, isolated using PolyAtract (Promega), was used to make cDNA with the Marathon cDNA Amplification Kit (Clontech). 2 ng of cDNA was used in each PCR reaction. The primers used were: Mez1F1, Mez1R1—CGG GAC CTA ACT CTA CGG ATG G (SEQ ID NO:21), Mez2F6—CGC AGC TGA TAC GGC AAG TCC AAT CG (SEQ ID NO:22) and Mez2R2—GTA TCA TCC GGA GCG ACT CTT CAG C (SEQ ID NO:23). Cycling conditions were as follows: 94° for 2′, 5 cycles of 94° for 30″, 70° for 30″, 72° for 1′, 5 cycles of 94° for 30″, 67.5° for 30″, 72° for 1′, then 25 cycles of 94° for 30″, 65° for 30″, 72° for 1′, followed by 72° for 7′. Each 25 μl reaction contained 1 μl of a 10 μM primer solution for each primer, 2 ng cDNA, 2.5 μl 10× buffer, 2 μl 25 mM MgCl₂, 0.3 μl 25 mM dNTP's (Promega), 0.2 μl Taq polymerase (Promega) and 17 μl ddH₂O.
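The arithmetic implied by the reaction mix and cycling program above can be worked through in a short Python sketch. The component volumes, stock concentrations and cycle counts are those recited in the text. Two points are assumptions: the text does not state the volume in which the 2 ng of cDNA was added, so the sketch treats the volume not accounted for by the other components as the cDNA, and the temperatures are taken to be in degrees Celsius. All variable names are hypothetical.

```python
# Reaction components as recited in the text, in microlitres.
REACTION_VOLUME_UL = 25.0
components_ul = {
    "forward primer (10 uM stock)": 1.0,
    "reverse primer (10 uM stock)": 1.0,
    "10x buffer": 2.5,
    "25 mM MgCl2": 2.0,
    "25 mM dNTPs": 0.3,
    "Taq polymerase": 0.2,
    "ddH2O": 17.0,
}

listed_volume_ul = sum(components_ul.values())
# Assumption: whatever volume remains is the cDNA template (2 ng).
cdna_volume_ul = REACTION_VOLUME_UL - listed_volume_ul


def final_concentration(stock: float, volume_ul: float,
                        total_ul: float = REACTION_VOLUME_UL) -> float:
    """Dilution: final concentration = stock concentration * volume added / total volume."""
    return stock * volume_ul / total_ul


primer_um = final_concentration(10.0, 1.0)  # micromolar, per primer
mgcl2_mm = final_concentration(25.0, 2.0)   # millimolar
dntp_mm = final_concentration(25.0, 0.3)    # millimolar

# Cycling program: (repeats, [(temperature, assumed deg C; seconds), ...]).
program = [
    (1, [(94, 120)]),
    (5, [(94, 30), (70, 30), (72, 60)]),
    (5, [(94, 30), (67.5, 30), (72, 60)]),
    (25, [(94, 30), (65, 30), (72, 60)]),
    (1, [(72, 420)]),
]
amplification_cycles = sum(n for n, steps in program if len(steps) == 3)
hold_seconds = sum(n * sum(sec for _, sec in steps) for n, steps in program)

print(f"listed components: {listed_volume_ul:.1f} ul; remainder: {cdna_volume_ul:.1f} ul")
print(f"primer {primer_um:.2f} uM, MgCl2 {mgcl2_mm:.2f} mM, dNTPs {dntp_mm:.2f} mM")
print(f"{amplification_cycles} cycles, {hold_seconds / 60:.0f} min of programmed hold time")
```

The hold time excludes the time the thermal cycler spends ramping between temperatures, so the actual run would be somewhat longer.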

RESULTS

Mez1 and Mez2:

Two contigs with significant similarity to the Drosophila E(z) were discovered in the Pioneer Hi-Bred EST database. These contigs were named Maize E(z)-like 1 (Mez1) and Maize E(z)-like 2 (Mez2). To test for the presence of Mez1 ESTs in the public maize database, the Mez1 cDNA was used in a BLASTN search (www.zmdb.iastate.edu). No Mez1 ESTs were found, but two putative trithorax hits were detected due to similarity of the E(z) and trithorax SET domains.

Mez1 was mapped to the short arm of chromosome 6 (bin 6.01-6.02). The Mez2 sequence was placed on the short arm of chromosome 9 (bin 9.04). Mutants with phenotypes similar to those of the Arabidopsis clf or medea mutants have not been mapped to these regions.

Alignment of Mez1 and Mez2

The amino acid sequences of Mez1 and Mez2 were aligned using ClustalW (FIG. 3). The sequences are 42% identical and 56% similar over their entire lengths. The nucleotide sequences of Mez1 and Mez2 are 52% identical. In maize, it is common to find two closely related sequences due to the ancient tetraploid nature of maize. Often the two sequences that arose from the tetraploid fusion display greater than 70% nucleotide identity (Gaut and Doebley, PNAS, U.S.A., 94:6809-6814 (1997)). The lower identity of the Mez1 and Mez2 nucleotide sequences indicates that these genes were probably duplicated prior to the maize tetraploidy event. In addition, the map positions of these two sequences do not correspond to colinear regions of the maize genome (Helentjaris, T., Maize Newsletter, 69:67-81 (1995)).
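For clarity, one way in which a percent identity figure can be computed from a pairwise alignment is sketched below in Python. The two short aligned strings are hypothetical and are not the Mez1 and Mez2 sequences. The choice of denominator (all aligned columns that are not gaps in both sequences) is an assumption; alignment programs differ on this point, so the sketch is not asserted to reproduce the figures reported above.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of compared alignment columns in which both sequences carry the same residue.

    Columns that are gaps ('-') in both sequences are skipped; a gap paired with a
    residue counts as a compared but non-identical column (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared


# Hypothetical aligned fragments, for illustration only.
toy_a = "MKT-AILV"
toy_b = "MRTGAI-V"
print(f"{percent_identity(toy_a, toy_b):.1f}% identical")
```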

Characteristics of Mez1 and Mez2

A putative bipartite nuclear localization signal is found in both Mez1 and Mez2 (See, FIGS. 4 and 5). Mez2 and Mez1 were aligned with the other characterized E(z)-like proteins using ClustalW (FIG. 4).

There are two regions near the C-terminus of the protein that are well conserved among all E(z) proteins (FIG. 4 a). These are the Cys-rich region and the SET domain. The Cys-rich region has a number of highly conserved cysteine residues. The spacing of the cysteine residues is unlike that of other Cys-rich zinc finger domains involved in DNA binding. The function of this domain is not known but it is highly conserved among all E(z)-like genes. Mez1 is 45% identical to E(z) in this region while Mez2 is 46% identical. The SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain found at the C-terminal end of the protein is also highly conserved. The SET domain of Mez1 is 55% identical to the E(z) SET domain (Mez2 is 54% identical). SET domains appear to be involved in mediating protein-protein interactions (Cui et al., Nat. Genet., 18:331-337 (1998); Huang et al., J. Biol. Chem., 273:15933-15939 (1998)). Interestingly, the nonspecific transcriptional activator, trithorax, also contains a SET domain, indicating that SET domains alone are not responsible for transcriptional repression.

The Mez1 and Mez2 sequences were submitted to the SMART server to identify other domains within these proteins (Schultz et al., PNAS USA, 95:5857-5864 (1998); Schultz et al., Nucl. Acids Res., 28:231-234 (2000)). In addition to the SET domain, a SANT (SWI3, ADA2, N-CoR and TFIIIB″ DNA-binding domains) domain was identified (FIGS. 4 and 5). The myb DNA-binding domain is a SANT domain as well. This indicates that plant E(z)-like genes have a domain that may facilitate DNA binding. The SMART program also predicts the presence of a SANT domain in the animal E(z)-like proteins.

An acidic region is present in E(z)-like proteins near the N-terminal region (FIGS. 4 and 5). The function of this domain is not known. This acidic region is conserved in all E(z)-like proteins. A small region near amino acid 250 of the plant E(z)-like proteins is highly conserved. This region, named the CRRC region, is not recognized by the SMART program. The CRRC region is composed primarily of polar or charged residues.

Evolution of E(z) Sequences:

Arabidopsis contains at least three E(z)-like genes that perform distinct functions. The low degree of nucleotide similarity between Mez1 and Mez2 indicates that these genes may have distinct evolutionary origins. The SET domain sequences of all E(z)-like proteins were aligned using ClustalW. This alignment was then processed using PHYLIP and a parsimonious tree was constructed (FIG. 5). The tree shows grouping of the Arabidopsis clf and the maize Mez1. When the full-length protein sequences were used for the alignments, the same tree was produced. The results indicate that Mez1 is a clf-like gene in maize while Mez2 is likely an EZA1 homolog.

Alternative Splicing of Mez2:

In an attempt to generate a full length Mez2 clone, PCR primers in the 5′ and 3′ UTR regions were used to amplify B73 ear cDNA. In addition to a major product of the expected size, two smaller products were observed (FIG. 6 a). These two products were excised and used for PCR reactions with primers from various regions of the gene to detect where the difference in size was arising. A region near the middle of Mez2 was identified and the PCR products from the two isoforms, Mez2 alternative splice 1 (Mez2^(as1)) and Mez2 alternative splice 2 (Mez2^(as2)), were sequenced. Sequencing revealed that the smaller products were identical to Mez2 except for the missing 659 base pairs in Mez2^(as1) and 810 base pairs in Mez2^(as2). The deleted fragment in Mez2^(as1) corresponds to base pairs 1016 to 1676 of Mez2. The Mez2^(as1) deletion will cause a frameshift and a truncated protein of 341 amino acids (FIG. 6). The deletion in Mez2^(as2) corresponds to base pairs 1016 to 1827 of Mez2 and does not result in a frameshift. The deletion in Mez2^(as2) results in a 624 amino acid protein that is missing the SANT domain. It is possible that the presence of multiple products in these PCR reactions is due to secondary structure of the RNA or aberrant PCR products. However, the presence of products displaying identical size shifts in PCR reactions using multiple primer sets makes it unlikely that these are the result of mispriming events. No significant secondary structure was identified in these regions using secondary structure prediction programs. Together, these findings indicate that the presence of multiple products is most likely due to alternative splicing of Mez2 mRNA.
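The reading-frame consequences stated above follow from the deletion lengths alone, as the following Python sketch illustrates. The deletion sizes (659 and 810 base pairs) are those reported in the text; the function name `deletion_effect` is hypothetical, and the sketch assumes that each deletion lies entirely within the coding region.

```python
def deletion_effect(deleted_bp: int) -> str:
    """Classify a coding-region deletion by its length modulo the codon size (3)."""
    return "in-frame" if deleted_bp % 3 == 0 else "frameshift"


# Deletion sizes as reported for the two Mez2 isoforms.
deletions = {"Mez2(as1)": 659, "Mez2(as2)": 810}

for isoform, size in deletions.items():
    effect = deletion_effect(size)
    detail = f"removes {size // 3} codons" if effect == "in-frame" else f"{size} mod 3 = {size % 3}"
    print(f"{isoform}: {size} bp deleted, {effect} ({detail})")
```

A deletion whose length is a multiple of three removes whole codons and leaves the downstream reading frame intact, whereas any other length shifts the frame and typically introduces a premature stop codon.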

Expression of Mez1 and Mez2:

cDNA from various maize tissues was tested for the presence of Mez1 and Mez2 transcripts. Abundant Mez1 transcripts were detected in embryo, ear and root tissues (FIG. 7 a). Transcripts were also present in leaf, BMS cell culture, and pollen tissues. There were no tissues tested that did not contain Mez1 transcripts.

The same tissues were tested for the presence of Mez2 transcripts (FIG. 7 b). The primers used to test for Mez2 expression flank the site of alternative splicing documented in ear cDNA. Amplification from ear cDNA revealed the presence of the three transcripts observed previously. In the lane amplified from embryo cDNA, a doublet of Mez2^(as2) and a smaller fragment is observed. The sequence of this smaller fragment has not been analyzed. No Mez2 or Mez2^(as1) transcripts are observed in embryo tissue. Mez2 transcripts are the predominant form in leaf tissue, with very faint Mez2^(as1) and Mez2^(as2) products. An intense Mez2 product is amplified from immature tassel cDNA. In addition, a Mez2^(as2) product and two uncharacterized products are present. Only Mez2^(as1) transcripts are detected in 3-day root cDNA. Faint Mez2 and Mez2^(as2) products are observed from the BMS cell culture cDNA.

Mutator Insertions Into Mez2:

The Mez1 and Mez2 sequences were submitted to the Pioneer Hi-Bred Int'l TUSC system. The TUSC system is designed to find Mutator (Mu) insertions in a sequence of interest. Difficulties were encountered in designing primers to amplify the Mez1 sequence. Mez2 primers were designed and used to screen the DNA pools. Four independent insertions were found. The location of the four Mu insertions and five of the Mez2 introns are shown in FIG. 2 a. Mez2-Mu1 is an intron insertion while Mez2-Mu2, Mez2-Mu3 and Mez2-Mu4 are all exon insertions.

All references, patents and patent applications referred to herein are hereby incorporated by reference.

The present invention is illustrated by way of the foregoing description and examples. The foregoing description is intended as a non-limiting illustration, since many variations will become apparent to those skilled in the art in view thereof. It is intended that all such variations within the scope and spirit of the appended claims be embraced thereby.

Changes can be made to the composition, operation and arrangement of the method of the present invention described herein without departing from the concept and scope of the invention as defined in the following claims.

1. An isolated nucleic acid for repressing the expression of or inhibiting the repression of a target gene comprising a polynucleotide selected from the group consisting of SEQ ID NO:3 and a polynucleotide having at least 95% sequence identity to SEQ ID NO:3.
2. (canceled)
3. (canceled)
4. (canceled)
5. (canceled)
6. (canceled)
7. (canceled)
8. (canceled)
9. (canceled)
10. (canceled)
11. (canceled)
12. (canceled)
13. (canceled)
14. An expression cassette comprising a promoter sequence operably linked to the nucleic acid of claim 1.
15. The expression cassette of claim 14 further comprising a polyadenylation signal operably linked to the nucleic acid.
16. The expression cassette of claim 14 wherein the promoter is a constitutive or tissue specific promoter.
17. A bacterial cell comprising the expression cassette of claim 14.
18. The bacterial cell of claim 17 wherein the bacterial cell is an Agrobacterium tumefaciens cell or an Agrobacterium rhizogenes cell.
19. A plant cell transformed with the expression cassette of claim 14.
20. A transformed plant containing the plant cell of claim 19.
21. The transformed plant of claim 20 wherein the plant is Zea mays.
22. A seed that contains the expression cassette of claim 14.
23. A transformed seed of the transformed plant of claim 20.